<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
  <title>Space Requirements for the ARCo database</title>
</head>

<body>
<h1>
   <font color="#336699">
   Space Requirements for the ARCo database
   </font>
</h1>

<p>The Grid Engine product includes a tool called the Accounting
and Reporting Console (ARCo), which stores
reporting data from the qmaster in a relational database.<br>
The attached <a href="#spreadsheet">spreadsheet</a> is intended to guide the administrator in
estimating the space requirements for such a database.<br>
The following calculations are meant to be simple guidelines and by no
means work for every case. The reader is encouraged to consult other
database configuration literature to complement the information found
in this document.</p>

<p>The ARCo module of the Grid Engine product consists of the dbwriter and the
reporting module (a web application which runs inside the Java(TM) Web
Console). When the reporting functionality is enabled, the qmaster
writes reporting data into the reporting file, located at
<i>&lt;SGE_ROOT&gt;/&lt;cell&gt;/common/reporting</i>. This file contains raw
values in a format described in the
<a href="http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman5/reporting.html">reporting(5)</a>
man page. The reporting_params parameter of the qmaster configuration controls
which information is written into the reporting file (see
<a href="http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman5/sge_conf.html">sge_conf(5)</a>).<br>
The dbwriter module performs the following tasks:</p>

<h2>
<font color="#336699">
   Importing of the reporting file
</font>
</h2>
<p>The dbwriter module periodically looks for the reporting file. If the file
exists, it will be renamed to reporting.processing, and the contents of
the file (the raw values) will be imported into the database. After the file
reporting.processing is completely processed, it is deleted by the dbwriter.</p>

<h2>
   <font color="#336699">
      Calculation of derived values
   </font>
</h2>
<p>Based on the raw values stored in the database, the dbwriter
module calculates a set of derived values. The rules for the derived
values are defined in the calculation file (by default
<i>&lt;SGE_ROOT&gt;/&lt;cell&gt;/dbwriter/&lt;database
type&gt;/dbwriter.xml</i>).</p>
<h3>Example:</h3>
<p>The following derived value rule calculates the derived value "h_load"
(hourly load) which is an average of the raw value "np_load_avg". The
calculation is performed every hour. The resulting derived values are stored
in the same database table as the raw values they were calculated from, in this
case sge_host_values.</p>
<pre>
   &lt;!-- average load per hour --&gt;
   &lt;derive object="host" interval="hour" variable="h_load"&gt;
      &lt;auto function="AVG" variable="np_load_avg" /&gt;
   &lt;/derive&gt;
</pre>
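<p>Rules for other intervals follow the same pattern. As an illustrative sketch
only (the rule shipped in dbwriter.xml may differ, and using "h_load" as the
input is an assumption), a daily average "d_load" could be derived from the
hourly values like this:</p>
<pre>
   &lt;!-- average load per day (illustrative, not copied from dbwriter.xml) --&gt;
   &lt;derive object="host" interval="day" variable="d_load"&gt;
      &lt;auto function="AVG" variable="h_load" /&gt;
   &lt;/derive&gt;
</pre>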

<h2>
   <font color="#336699">
      Deletion of outdated values
   </font>
</h2>
<p>Deletion rules are also defined in the calculation file. A deletion rule
defines how long a raw or derived value stays in the database. If
correctly configured, the deletion rules keep the database at an
approximately constant size.</p>
<h3>Example:</h3>
<p>The following example deletes all the records from the sge_host_values table
where "hv_variable" equals "np_load_avg" and the values are older than 7 days.</p>

<pre>
   &lt;delete scope="host_values" time_range="day" time_amount="7"&gt;
      &lt;sub_scope&gt;np_load_avg&lt;/sub_scope&gt;
   &lt;/delete&gt;
</pre>
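<p>The same rule shape could express longer lifetimes, such as the two-year
lifetime of the hourly host values in the default configuration. The following
sketch assumes that "year" is accepted as a time_range and that several
sub_scope elements may be listed in one rule; both assumptions should be checked
against the dbwriter documentation before use.</p>
<pre>
   &lt;!-- illustrative: delete hourly host values older than 2 years --&gt;
   &lt;delete scope="host_values" time_range="year" time_amount="2"&gt;
      &lt;sub_scope&gt;h_load&lt;/sub_scope&gt;
      &lt;sub_scope&gt;h_cpu&lt;/sub_scope&gt;
   &lt;/delete&gt;
</pre>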

<p>For detailed information, please refer to the section "Configuring DBWriter" in the
<a href="http://docs.sun.com/app/docs/doc/820-0698/frjtz">
Sun Grid Engine Administration Guide</a>.</p>

<h2>
   <font color="#336699">
      Default configuration
   </font>
</h2>
<p>The following table shows all the raw and derived values with their
associated lifetimes, assuming the default configuration of the
dbwriter module is used.</p>
<div class="informaltable">
<table border="1">
  <colgroup><col><col><col><col></colgroup><tbody>
    <tr>
      <td><b>Database Table</b></td>
      <td><b>Interval</b></td>
      <td><b>Variable</b></td>
      <td><b>Lifetime</b></td>
    </tr>
    <tr>
      <td align="left">department_values </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="left">group_values </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="left">host_values </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> day </td>
      <td align="center"> d_jobs_finished </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> d_load </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> hour </td>
      <td align="center"> h_cpu </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> h_jobs </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> h_load </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> raw value </td>
      <td align="center"> cpu </td>
      <td align="center"> 7 days </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> mem_free </td>
      <td align="center"> 7 days </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> np_load_avg </td>
      <td align="center"> 7 days </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> virtual_free </td>
      <td align="center"> 7 days </td>
    </tr>
    <tr>
      <td align="left">job </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 1 year </td>
    </tr>
    <tr>
      <td align="left">job_log </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 1 month </td>
    </tr>
    <tr>
      <td align="left">project_values </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> hour </td>
      <td align="center"> h_jobs_finished </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="left">queue_values </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> hour </td>
      <td align="center"> h_utilized </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> raw value </td>
      <td align="center"> slots </td>
      <td align="center"> 1 month </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center">&nbsp;</td>
      <td align="center"> state </td>
      <td align="center"> 1 month </td>
    </tr>
    <tr>
      <td align="left">share_log </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 1 year </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> raw value </td>
      <td align="center"> user1 </td>
      <td align="center"> 1 month </td>
    </tr>
    <tr>
      <td align="left">user_values </td>
      <td align="center"> * </td>
      <td align="center"> * </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> day </td>
      <td align="center"> d_jobs_finished </td>
      <td align="center"> 2 years </td>
    </tr>
    <tr>
      <td align="center">&nbsp;</td>
      <td align="center"> hour </td>
      <td align="center"> h_jobs_finished </td>
      <td align="center"> 2 years </td>
    </tr>
  </tbody>
</table>
</div>

<h2>
   <font color="#336699">
      Estimating the space requirements
   </font>
</h2>
<p>With the knowledge of how the dbwriter module works, it is possible
to estimate the space requirements of an ARCo database. Attached to this
article is a spreadsheet which contains all the formulas required for such a
calculation. The administrator only needs to enter the specific parameters of
the Grid Engine cluster. Some cells carry notes with additional comments, such
as descriptions of configuration parameters. These cells are marked with a
small red box in the upper right corner.</p>
<p>The calculation is based on the default configuration of the Grid Engine
product. The results may differ if the product configuration changes.<br>
The average space usage of each row of a database table has been taken
from the data dictionary of an Oracle 9i database which has been filled
by the dbwriter module. Due to the complexities of database
configuration, this spreadsheet can give only a guideline for how much
space will be required for such a database system.</p>
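<p>As a rough illustration of the kind of arithmetic involved, consider the raw
and hourly host values from the default configuration table. All input numbers
below (100 hosts, one sample per host and variable every 40 seconds, and 100
bytes per row) are assumptions chosen for illustration only. They are not taken
from the spreadsheet, and the real per-row sizes should be read from it or
measured on the target database.</p>
<pre>
   rows = hosts x variables x samples per day x lifetime in days

   raw host values (cpu, mem_free, np_load_avg, virtual_free; 7 days):
      100 x 4 x (86400 / 40) x 7   = 6,048,000 rows
      6,048,000 rows x 100 bytes   = 604,800,000 bytes (about 605 MB)

   hourly host values (h_cpu, h_jobs, h_load; 2 years = 730 days):
      100 x 3 x 24 x 730           = 5,256,000 rows
      5,256,000 rows x 100 bytes   = 525,600,000 bytes (about 526 MB)
</pre>
<p>Under these assumptions the long-lived derived values occupy about as much
space as the short-lived raw values, so both the sampling interval and the
configured lifetimes matter for the estimate. Indexes and other database
overhead are not included in this sketch.</p>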

<h2>
   <font color="#336699">
      Attachments:
   </font>
</h2>
<p>
   <a name="spreadsheet"></a>
   <a href="http://gridengine.sunsource.net/howto/arco/arco_db_size.ods">
      arco_db_size.ods
   </a>
</p>
</body>
</html>
